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Abstract —Greedy algorithms for minimizing LO-norm of sparse de¬ 
composition have profound application impact on many signal processing 
problems. In the sparse coding setup, given the observations y and the 
rednndant dictionary ^>, one would seek the most sparse coefficient 
(signal) X with a constraint on approximation fidelity. In this work, we 
propose a greedy algorithm based on the classic orthogonal matching 
pnrsuit (OMP) with improved sparsity on x and better recovery rate, 
which we name as eOMP. The key ingredient of the eOMP is recursively 
performing one-step orthonormalization on the remaining atoms, and 
evalnating correlations between residnal and orthonormalized atoms. 
We show a proof that the proposed eOMP guarantees to maximize the 
residual reduction at each iteration. Through extensive simulations, we 
show the proposed algorithm has better exact recovery rate on i.i.d. 
Ganssian ensembles with Gaussian signals, and more importantly yields 
smaller LO-norm under the same approximation fidelity compared to the 
original OMP, for both synthetic and practical scenarios. The complexity 
analysis and real rnnning time resnit also show a manageable complexity 
increase over the original OMP. We claim that the proposed algorithm 
has better practical perspective for finding more sparse representations 
than existing greedy algorithms. 

I. Introduction 

T he recent developments in sparse recovery have yielded pro¬ 
found impact on many old and new signal processing problems. 
In the sparse coding setting, people are looking for solving the classic 
inverse problem where the sampling matrix (or dictionary) # con¬ 
structs an underdetermined system. Specifically, given $ £ 
where M ^ N and observations y £ R^, it solves an x £ R^'^ via 
LO-norm minimization problem: 

min |x|o 

s.t. ||y-$x|| 2 <e (1) 

The nature of non-convexity of the LO-norm leads to two genre 
of algorithms for solving the above problem. The first category 
of algorithms seeks to relax the non-convex LO-norm term into a 
more optimization friendly Ll-norm, such as the well-known LASSO 
algorithm jTj. Although it has mathematical guarantee on global 
convergence, the Ll-norm in fact does not precisely measure the 
sparsity, which is the number of non-zero coefficients in the solution 
vector. On the other hand, another group of scholars design a series 
of primarily greedy algorithms to minimize the LO-norm directly. An 
exemplar of such algorithms is the orthogonal matching pursuit Q, 
which iteratively finds the best correlated atom to the residual. More 
recently, several improved algorithms have been proposed. In Q 
and 0, the authors proposed two similar algorithms (CoSaMP and 
subspace pursuit), respectively, that have been proven to have better 
exact recovery conditions and faster convergence, for signals with 
known sparsity. 

Although the new algorithms are promising, we would like to 
note that they are mostly based on the following two assumptions: 
1) all above cited work have proved exact recovery conditions that 
in general require the sampling matrix to be nearly orthonormal 
(as measured by the restricted isometry property Q); and 2) in the 
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synthetic simulations demonstrated, the signals are often constructed 
with known sparsity. This is particularly true for and Q, wherein 
the estimated true sparsity is needed as the algorithm input. We 
would like to note these assumptions hamper the usability of these 
algorithms in the practical sparse coding scenario, where the sampling 
matrix is usually highly correlated and the actual signal sparsity is 
unknown. Indeed, and 0 are targeting a different version of 
Eq.0 which tries to minimize the fidelity with an equality constraint 
on the LO-norm of x. 

From this standpoint, we propose a greedy algorithm based on the 
classic OMP algorithm which better suits the scenario with unknown 
sparsity and correlated sampling matrix. The proposed algorithm 
recursively performs one-step orthonormalization on the remaining 
atoms with respect to previously chosen atoms in the dictionary 
at each iteration, so that the orthnormalized remaining atoms span 
over the null space of the current signal estimate. We show through 
extensive synthetic and real simulations that the proposed algorithm 
is able to achieve better sparsity compared to the classic OMP, with 
a manageable complexity increase. We name the proposed algorithm 
as eOMP because it uses embedded recursive orthonormalization. 

The rest of the paper is organized as follows. In Sec.|^ we lay out 
the details of the proposed eOMP algorithm and we show the proof 
that the proposed eOMP guarantees to maximize residual reduction 
at each iteration; in Sec.[^ we provide extensive simulation results 
for synthetic simulations under two sampling matrix configurations, 
the i.i.d Gaussian random ensemble and the overcomplete DCT 
dictionary, as well as a practical simulation based on coding video 
blocks by finding a sparse representation over a highly redundant 
and correlated self-adaptive dictionary 0); we show the complexity 
analysis and real running time statistics of the proposed algorithm in 
Sec. |IV[ we conclude the paper in Sec. [V] 

IT Improving the OMP Algorithm 

In the classic OMP, at each iteration, the algorithm consists of 
the following three major steps: find the most correlated column 
(atom) among all remaining columns of the sampling matrix with 
the current residual and append the chosen atom index into the set of 
chosen atoms (known as the support), a least squares update of the 
coefficients corresponding to the new support, and at last the residual 
is updated using least squares coefficients. As all greedy algorithms, 
the OMP converges when the residual norm stops decreasing or 
reaches the preset e shown in Eq. [T] 

Before we jump to our proposed algorithm, we introduce an 
alternative way of implementing the original OMP algorithm by 
orthonormalizing the chosen atoms incrementally. Specifically, at 
each iteration, after selecting the chosen atom, we orthonormalize 
it with respect to the previous chosen atoms using the Gram-Schmidt 
algorithm. The coefficient with respect to this orthonormalized atom 
can be then obtained simply by the inner product (i.e. projection) of 
the current residual to this orthonormalized atom. This alternative 
way of implementing OMP is shown in Algorithm [T] We would 
like to note that although the solution vector corresponds to the 
orthonormalized atoms, finding the coefficients to the original chosen 
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atoms can be done after all atoms are chosen by a single least squares 
step. 


Algorithm 1: Alternative OMP algorithm with incremental or- 
thonormalization of chosen atoms _ 

Data: ^ assuming # has zero mean 

and unit variance, and y has zero mean. 

Result: The set of chosen atom index 1C, chosen 

orthonormalized atoms D, solution vector z w.r.t D, 
solution vector x w.r.t #. 

initialize residual r ■(— y, IC — 0, IC‘^ = {1,2,..., M}, ^ 

D = 0, iteration counter f = 1 ; 
while ||r ||2 >= e do 

1. Update remaining atoms ^ $(:, i), Vi £ IC^', 

2. (I>t t— argmax^^g^ K<;^n,r)| and k ^ index of ij>u 

3. Update chosen index set /C /C U {fc} and remaining 
index set KC •<— 

4. Orthonormalize (pt w.r.t D: t/jt _L D; 

5. Augment D [Dlt/it]; 

6. zt -s- (r, t/'t); 

7. Update r r — zttpt', 

8. Augment z [z|zt]; 

9. t^t + l- 
end 

(Optional) Denote the chosen original atoms indexed by 1C as 
^ic, solve least square problem x = 


We note that the solution vector z can be written as 

z = D^y. (2) 


(Step 4), and the coefficient is obtained using the projection to the 
orthonormalized atom (Step 6). 

To maximize the error reduction at each new iteration, eOMP 
algorithm first orthonormalizes all the remaining atoms with respect 
to all previously chosen atoms, and then compute the correlation of 
the current residual with these othonormalized atoms and find the one 
with the largest correlation magnitude. Instead of orthonormalizing 
each remaining original atom with respect to all previously chosen 
atoms, we further propose to perform the orthonormalization recur¬ 
sively, i.e., each time a new atom is chosen, all remaining atoms are 
orthonormalized with respect to this new atom only, and the original 
remaining atoms will he replaced with the newly othonormalized 
ones. 

More specifically, starting at the second iteration, the algorithm 
recursively performs a one-step orthonormalization of all remaining 
atoms with respect to the chosen atoms. Denote the last orthonormal¬ 
ized atom in the chosen set D by d, the one-step orthonormalization 
operates on each remaining atom t/ii hy: 


tpi tpi- {d,i}i)d 





(5) 


The orthonormalized atoms, instead of their original atoms in the 
sampling matrix, are then used to evaluate the correlation with 
the current residual and the one with the highest correlation (in 
magnitude) is chosen as the next atom. The computation of the new 
coefficient and new residual can be done using the newly chosen 
atom, which is already orthonormalized. 

The proposed algorithm is presented in Algorithm]^ We highlight 
the differences of our proposed eOMP from Algorithm [T] 


This is because zt = {rtjtpt} in Step 6 can also be written as 
Zt = {y, tpt) because y can be represented as linear combination of 
previously found orthonormalized atoms (which are all orthogonal to 
tpt) and rt. 

To show Algorithm [T] yields identical solution to the original OMP, 
let us examine the QR decomposition method commonly used for the 
least squares update. For a residual r = y — ^kx, denote the QR 
decomposition on the chosen atom matrix ^/c hy QR. The Q matrix 
is exactly the D matrix in Algorithm[2hy the construction of the QR 
decomposition. To show that the original OMP solution is equivalent 
to Algorithm we just need to show that Rx is exactly equal to Z 
from Algorithm 1, when Q = D. Left multiply on the residual 
equation, we get the following 

Q^r = Q^y - (Q^Q)Rx 

= Q^y - R-x (3) 

To minimize ||r ||2 (least square), it is equivalent to solve the system 
of linear equations by setting Eq.|^to zero. 

min ||r ||2 44 Rx = Q^y (4) 

The right side of the above equation is identical to the solution in 
Eq.|^ Therefore, Algorithm [T] is an alternative way to give the same 
solution as the original OMP. 

Our proposed eOMP algorithm is motivated by Algorithm[2 Notice 
that the update in Step 7 of Algorithm [T] will lead to a residual 
reduction norm of |zt| with each new atom. Therefore, to maximize 
the residual reduction hy each new atom, we should maximize |zt|, 
the magnitude of the correlation of the current residual with the 
othonormalized chosen atom. However, in Step 2, the atom is chosen 
based on the correlation with the original atoms. Only after an atom 
is chosen based on this correlation, then the atom is orthonormalized 


Algorithm 2: Proposed eOMP algorithm with recursive orthonor¬ 
malization of remaining atoms 

Data: $ e e e, assuming $ has zero mean 

and unit variance, and y has zero mean. 

Result: The set of chosen atom index 1C, chosen 

orthonormalized atoms D, solution vector z w.r.t D, 
solution vector x w.r.t 

initialize residual i ■(— y, IC — 0, IC‘^ = [1,2,..., M}, ^ 

D = 0, iteration counter f = 1 ; 
while ||r ||2 >= e do 

1. Update remaining atoms ■<— Vi £ IC^’, 

2. Perform one-step orthonormalization Eq. of all columns 
in 'S' w.r.t tpt-i', 

3. tpt ■(— argmax.^^g,p r)| and k index of tpt\ 

4. Update chosen index set /C Ai U {fc} and remaining 
index set •<— 

5. Augment D ■<— [Dlt/jt]; 

6. Zt *r- (r, tpt)-, 

7. Update r ■<— r — zttpp, 

8. Augment z •<— [z|zt]; 

9. 

end 

(Optional) Denote the chosen original atoms indexed hy K. as 
S'/C, solve least square problem x = ^^y. 


There are a few implementation notes to accelerate the proposed 
algorithm. First, the final least squares step is optional, needed only 
when a solution vector with respect to the original sampling matrix 
$ is desired. Computationally this is equivalent to the last iteration 
solution update in the original OMP. Note that in applications of 











3 


sparse representation for signal compression, it is better to use the 
coefficients associated with orthonormalized atoms to represent a 
signal, as in Second, in Step 3 when selecting the best atom for 
the current iteration, one could also mark the atoms that have near¬ 
zero correlations and exclude them from the consideration in future 
iterations. Last, one could impose additional convergence conditions 
in the while-loop. One commonly used condition is to compare the 
norm of r in the most recent iterations and terminate when the norm 
stops to decrease significantly. 

Now we turn to giving a proof on why eOMP works better. For 
that, we have the following lemma. 

Lemma 1. In each iteration, eOMP maximizes the residual reduction 
II A|| 2 , where A rt_i — rt. 

Proof. Given an orthonormalized atom ip, the residual reduction A 
from using that atom is given by aip, where a = {Tt-i,ip). The 
residual norm is ||A ||2 = |a|. 

In eOMP, Step 3 selects a particular orthonormalized atom to 
exactly maximize |a|, since ipt = argmax^(rt_i, t/i) = a. Therefore 
eOMP guarantees to maximize the residual reduction. □ 

III. Numerical Experiments 

In this section, we show our numerical experiments in both 
synthetic and real settings for the proposed eOMP algorithm and 
compare its performance with original OMR We first show and 
compare the exact recovery rate of the signal x with known sparsity, 
and then the recovered sparsity when the estimate of the observation 
y is numerically identical to y, under unknown sparsity conditions. 

A. Synthetic simulations 

In the first configuration, we consider the exact recovery rate 
experiment. Consider an i.i.d. Gaussian ensemble $ € 
with N = 128, we construct a fc-sparse Gaussian signal x € 
and produce the observations y = $x. We use our proposed eOMP 
algorithm and the original OMP algorithm to run k iterations, and 
compare the exact recovery rate by measuring whether the recovered 
signal X and x are numerically identical. The above simulation is ran 
for 500 realizations. 

Figure [T] compares the exact recovery rate between eOMP and 
OMP. Proposed eOMP algorithm still outperforms the original OMP 
and has a slower dropping rate beyond the critical sparsity point (i.e. 
the point where algorithm starts to fail exactly recovering the signals). 

In the second configuration of synthetic signals, we consider the 
unknown sparsity case. To evaluate, given a sampling matrix $ and 
k, we run both the eOMP and the original OMP until the residual 
norm ||r ||2 reaches machine precision (e < 10“®). We then measure 
the LO-norm of recovered signals x, which we term as the recovered 
sparsity. We consider two types of sampling matrix, the i.i.d. Gaussian 
ensemble of R^^^^ and the overcomplete DCT (ODCT) with 2x, 
4x, and 8x number of signal dimension. Note that the ODCT atoms 
have higher mutual coherence than the i.i.d. Gaussian ensemble. The 
above simulation is also ran for 500 realizations for each k. 

Figure shows the recovered sparsity by eOMP and OMP for 
i.i.d. Gaussian ensembles. Our simulations show that eOMP results 
in lower recovered K than OMP over the entire range of true signal 
sparsity K examined (1 to 80). However, to show the difference more 
clearly, we only show the k range in which the proposed eOMP 
significantly differs from the original OMP. The result shows when 
k € [40, 70], the proposed eOMP can achieve up to more than 20% 
LO-norm reduction than OMP. 


Gaussian signal, Gaussian random matrix, 500 realizations 



Fig. 1. Exact recovery rate for Gaussian sampling matrix 


Gaussian signal, Gaussian random matrix, 500 realizations 
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Fig. 2. Recovered sparsity for Gaussian sampling matrix 


Figure shows the recovered sparsity of eOMP and OMP for 
ODCT dictionaries. In all 3 ODCT configurations, we consistently 
see a large sparsity saving comparing to OMP. The sparsity saving in 
ODCT cases is larger than the Gaussian ensemble case, which implies 
our proposed eOMP is more efficient for finding sparse solution 
when the sampling matrix $ is more mutually coherent. We note 
that it is well-knwon that practical audio and image signals can be 
represented efficiently using DCT and ODCT. Therefore, this is a 
very encouraging result. 


B. Real simulations 

In this section, we show a simulation by adopting our proposed 
eOMP on a realistic application. We examine the gain of the proposed 
eOMP over the original OMP in the video sparse representation 
proposed in Q. The essence of © is to find a sparse representation 
of a video frame block, by using a self-adaptive redundant dictionary 
consisting of all shifted blocks in the previous frame over a large 
search range. The motivation for using such a dictionary to represent 
a block is that the current block is likely to be very close to a 
few atoms in the dictionary. In the special case when the current 
block is exactly equal to a shifted block in the previous frame due 
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Signal sparsity 


Seq. footbail 



Fig. 5. PSNR vs. K for Seq. Football 


Fig. 3. Recovered sparsity for overcomplete DCT basis 


Seq. Stockholm 



Fig. 4. PSNR vs. K for Seq. Stockholm 


to motion, the sparsity is 1. In the example examined here, the block 
size is 16 x 16, and the search range of (—23 ~ 24) shift in both 
horizontal and vertical direction is used, leading to a dictionary of 
size = 16 X 16, M = 2304. Furthermore, we do not know the 
true sparsity of each block. 

We show the result as PSNR vs. K plots, where the vertical 
is the recovery fidelity measured by PSNR, and the horizontal is 
the average LO-norm over all the blocks in a frame. To generate 
a curve, we vary the recovery fidelity threshold e to approximate 
different expected distortions of applying uniform quantization on 
the coefficients with respect to the orthonormalized atoms. Assuming 
the bits spent on each non-zero entry is roughly the same, the higher 
curve would indicate better quality for the same bit rate. Figures 
and|^show PSNR vs. K for two video sequences called Stockholm 
and Football, respectively. By using the proposed algorithm, the 
recovered K for the same reconstruction fidelity (i.e. same PSNR) 
reduces by up to 30%, which translates to substantially reduced bit 
rate. This is impressive improvement as this simulation was ran on 
real signals (images) with a real dictionary that is highly redundant 
and correlated. 


IV. Complexity Analysis 


The major difference between the original OMP (implemented 
using Algorithmic and eOMP algorithms is the need of recursively 
performing one-step orthonormalization for all the remaining atoms. 
Although it sounds tedious, the complexity analysis we will show 
here implies manageable complexity increase. First, without loss of 
generality, assume the majority of the complexity coming from mul¬ 
tiplications. Therefore an element-wise multiplication of a vector in 
is of complexity 0{N). The one-step orthonormalization shown 
in Eq. |C includes two inner products and two scaling. To perform 
a one-step orthonormalization, the total complexity is 0(4A). Note 
that this has to be done for each remaining atom. 

The rest of the ingredients are shared among two algorithms. The 
correlation calculation between the residual and each dictionary atom 
takes a complexity of 0(A); the residual/solution update takes a 
complexity of 0{2N) in each iteration. For OMP using Algorithm[C 
orthonormalization one atom with respect to k previous chosen atoms 
has a complexity of O {{2k + 2)N). Assuming no other numerical 
trick is used, this complexity is similar to that of solving a least 
square update by QR decomposition. 

Now, consider the case that we run both algorithms on a sampling 
matrix of with signal in R^ for s iterations. For original 

OMP algorithm in Algorithm [T] note that the set of chosen atoms is 
always increasing, so the orthonormalization process takes a complex¬ 
ity of O ((1 4- ... -b s — 1) 2N + 2sN) for s iterations. Therefore 
the total complexity is: 


0{{K + K-1 +... +K - s + l)N) + 

O ((1 -F ... -F s - 1) 2A -b 2sN) + O {2sN) 
'{2K-s + 7)s 


= O 


+ s{s-l)]N 


( 6 ) 


For the eOMR one-step orthonormalization is applied to all remaining 
atoms, which is decreasing in number, so the second term is replaced 
by O {{K — 1-F...-FA' — s-Fl) 4A) for s iterations. Therefore 
the total complexity is: 


0{{K + K-l + ... + K-s + l)N) + 
0{{K-l + ... + K-s + l)‘iN) + 0 {2sN) 


= O 


^^ (2A-s-F5)s 


-F2(2A' 


s)(s- 1) 



(7) 


Assuming K ^ s, Eq. and Eq. are approximately 0{NKs) 
and 0{5NKs), respectively. The ratio of two complexity terms is 
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TABLE I 

Total running time (in seconds) 


Signal sparsity 

10 

20 

30 

40 

50 

60 

OMP in SparseLab 


0.8614 

2.393 

4.076 

5.506 

6.418 

6.678 

OMP in Algorithm 


0.6078 

1.709 

2.889 

3.866 

4.542 

4.666 

Proposed eOMP 

5.119 

11.30 

17.45 

25.64 

33.71 

39.79 


constantly around 5. To validate this, we measure the total running 
time for the 2 x overcomplete DCT simulation shown in Sec. |III-A| for 
both the original OMP and the eOMP. For the original OMP, we use 
the implementation provided in SparseLab software, which solves 
the least squares update using QR decomposition. Table |I] lists the 
total running time for all three algorithms, implemented in Matlab, 
for 500 trials of each given K. We see the ratios are around 5 x. The 
slightly larger ratio may be due to the inefficient implementation of 
our eOMP algorithm. 


V. Conclusion 

In this work, we propose an improved OMP algorithm eOMP, 
by recursively orthonormalizing the remaining atoms. We have 
proved the eOMP algorithm maximizes the residual reduction at 
each iteration. Through extensive numerical simulations with both 
synthetic and real data, we conclude the proposed eOMP algorithm 
can consistently achieve better recovered sparsity and higher exact 
recovery rate than the original OMP. The gain is more significant 
when the signal’s actual sparsity is unknown and the sampling matrix 
(dictionary) is highly correlated. Our complexity analysis shows that 
the proposed eOMP algorithm has the same order of magnitude 
of complexity as the original OMP, roughly equals 5 times more 
complexity than that of the original OMP when the sparsity is 
significantly smaller than the dictionary size. 

The work reported in this paper is by no means complete: we 
have yet to provide a mathematical proof on the exact recovery 
condition, and we are looking to possible further ways to reducing the 
running time. There are several recent theoretical developments on 
the greedy LO-norm algorithms but as far as we known, none of them 
are designed with a clear application perspective. For example, the 
algorithm needs to know the actual signal sparsity, which is typically 
unknown in practical application. Our goal is to pursue a better, more 
application friendly LO-norm minimization algorithm. 
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